How to remove duplicate rows in SQL?
12-Apr-2023
Updated on 26-Nov-2023
Aryan Kumar
26-Nov-2023
To remove duplicate rows from a query result in SQL, you can use the DISTINCT keyword or the GROUP BY clause. To delete the duplicates from the table itself, you need a DELETE statement. Here are three common approaches:
Using DISTINCT:
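If you want to select distinct rows based on all columns, you can use the DISTINCT keyword, for example: SELECT DISTINCT * FROM your_table;
This query returns only distinct rows across all columns of the specified table (your_table is a placeholder name). It does not change the table itself; the duplicates are only left out of the result.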
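To see the behaviour of DISTINCT on real data, here is a minimal sketch that runs the query against an in-memory SQLite database through Python's sqlite3 module. The customers table and its rows are made up for illustration; they are not part of the question.

```python
import sqlite3

# Hypothetical table and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ana", "Lima"), ("Ana", "Lima"), ("Ben", "Oslo"), ("Ana", "Quito")],
)

# DISTINCT compares whole rows: the two ("Ana", "Lima") rows collapse
# into one, while ("Ana", "Quito") stays because the city differs.
distinct_rows = conn.execute(
    "SELECT DISTINCT * FROM customers ORDER BY name, city"
).fetchall()
print(distinct_rows)

# The table itself is unchanged: it still holds all four rows.
total_rows = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(total_rows)
```

The second query is there to show that DISTINCT only filters the result set and leaves the stored rows alone.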
Using GROUP BY:
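If you want to remove duplicates based on specific columns, you can use the GROUP BY clause, for example: SELECT column1, column2, ..., columnN FROM your_table GROUP BY column1, column2, ..., columnN;
Replace column1, column2, ..., columnN with the columns you want to consider for uniqueness. This query returns one row for each unique combination of the specified columns. Like DISTINCT, it only affects the query result and leaves the table untouched.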
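A small sketch of the GROUP BY approach, again using an in-memory SQLite database from Python. The orders table, its columns and its rows are hypothetical. The COUNT(*) column and the HAVING filter are additions that go slightly beyond the basic query: they are a common way to check how many copies each combination has before you decide to delete anything.

```python
import sqlite3

# Hypothetical table and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, product TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer, product) VALUES (?, ?)",
    [("Ana", "pen"), ("Ana", "pen"), ("Ana", "ink"), ("Ben", "pen")],
)

# One row per unique (customer, product) combination, with a copy count.
grouped = conn.execute(
    "SELECT customer, product, COUNT(*) AS copies "
    "FROM orders "
    "GROUP BY customer, product "
    "ORDER BY customer, product"
).fetchall()
print(grouped)

# Only the combinations that occur more than once, i.e. the duplicates.
duplicates_only = conn.execute(
    "SELECT customer, product, COUNT(*) AS copies "
    "FROM orders "
    "GROUP BY customer, product "
    "HAVING COUNT(*) > 1"
).fetchall()
print(duplicates_only)
```

Note that the id column is deliberately left out of the GROUP BY list: it is different on every row, so including it would make every row unique.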
Removing Duplicates and Keeping One Copy:
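If you want to delete the duplicate rows and keep only one copy, you can use the DELETE statement with a common table expression (CTE) and the ROW_NUMBER() window function. The CTE selects the rows together with ROW_NUMBER() OVER (PARTITION BY column1, column2, ..., columnN ORDER BY ...) AS RowNum.
The ROW_NUMBER() function numbers the rows within each partition sequentially, starting at 1. The PARTITION BY clause specifies the columns for determining duplicates, so rows with the same values in those columns land in the same partition, and the ORDER BY inside OVER decides which copy gets number 1. The DELETE statement then removes rows with RowNum greater than 1, effectively keeping only one copy of each unique combination. Deleting directly from the CTE works in SQL Server; in most other databases you delete from the table by key, using the CTE to pick the rows.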
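Here is a minimal sketch of this pattern, run against an in-memory SQLite database through Python's sqlite3 module. It assumes SQLite 3.25 or newer (needed for window functions) and a hypothetical contacts table with an id primary key. SQLite cannot delete from the CTE directly the way SQL Server can, so the sketch deletes from the table by id and uses the CTE only to pick the rows to remove.

```python
import sqlite3

# Hypothetical table and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
)
conn.executemany(
    "INSERT INTO contacts (name, email) VALUES (?, ?)",
    [
        ("Ana", "ana@example.com"),   # id 1, kept (first copy)
        ("Ana", "ana@example.com"),   # id 2, duplicate of id 1
        ("Ben", "ben@example.com"),   # id 3, kept
        ("Ana", "ana@example.com"),   # id 4, duplicate of id 1
        ("Ana", "ana@work.example"),  # id 5, kept (different email)
    ],
)

# Number the rows inside each (name, email) group, lowest id first,
# then delete every row that is not number 1 in its group.
conn.execute(
    """
    WITH numbered AS (
        SELECT id,
               ROW_NUMBER() OVER (
                   PARTITION BY name, email
                   ORDER BY id
               ) AS RowNum
        FROM contacts
    )
    DELETE FROM contacts
    WHERE id IN (SELECT id FROM numbered WHERE RowNum > 1)
    """
)
conn.commit()

remaining = conn.execute(
    "SELECT id, name, email FROM contacts ORDER BY id"
).fetchall()
print(remaining)
```

Ordering by id keeps the oldest copy in each group. If you would rather keep the newest one, ORDER BY id DESC inside the OVER clause should do it. Running the CTE with a plain SELECT first is a cheap way to check which rows would be deleted.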
Remember to replace column1, column2, ..., columnN with the actual column names in your table. Also, be cautious when performing deletions, especially if the table contains important data. Consider taking a backup before making changes, or test the query on a smaller dataset first.